Abstract 

We consider regularly varying random vectors. Our goal is to estimate, 
in a nonparametric way, characteristics related to conditioning on 
an extreme event, such as the tail dependence coefficient. We introduce a 
quasi-spectral decomposition that improves the efficiency of the estimators. 
Asymptotic normality of the estimators is based on weak convergence of 
tail empirical processes. The theoretical results are supported by simulation 
studies. 


1 Introduction 


Assume that $(X,Y)$ is a regularly varying random vector with index $\alpha$ and $F$ 
is the marginal distribution of $X$. When dealing with extreme observations, we 
are often interested in estimating 

$$ \mathbb{E}\left[ \psi\left(\frac{X}{x},\frac{Y}{x}\right) \,\Big|\, (X,Y) \in xC \right], \tag{1} $$

where $\psi : \mathbb{R}^2 \to \mathbb{R}$, $C$ is a suitably chosen subset of $\overline{\mathbb{R}}^2 \setminus \{0\}$ and $x$ is large. For 
example, $x$ can be chosen as $x = x_p = F^{\leftarrow}(1-p)$, where $p$ is small (in financial 
applications the value $x_p$ is called the Value-at-Risk). Special cases include 
estimation of the conditional tail distribution 


$$ 1 - G(y) = \lim_{x\to\infty} \mathbb{P}(Y > yx \mid X > x), \tag{2} $$


estimation of the conditional tail expectation (expected shortfall) 


$$ \lim_{x\to\infty} \mathbb{E}\left[ Y/x \mid X > x \right], $$


or the extremal dependence measure 

$$ \lim_{x\to\infty} \mathbb{E}\left[ \frac{XY}{\|(X,Y)\|^2} \,\Big|\, \|(X,Y)\| > x \right], $$

where $\|\cdot\|$ is a vector norm on $\mathbb{R}^2$. The first problem is linked to estimation 
of the tail dependence coefficient, the second one to modeling of the expected 
shortfall ([?]), while the last one was introduced and studied in [?]. 

In specific cases, estimators of (1) can be obtained in a parametric or semiparametric 
way and rely on the particular model chosen. Alternatively, one can 
consider nonparametric approaches (see [?, Chapter 9] for related theory and 
methods, as well as an extensive list of references). Specifically, having an i.i.d. 
sample $(X_i,Y_i)$, $i = 1,\dots,n$, from $(X,Y)$, estimation of the conditional tail 
distribution in (2) can be achieved by 


$$ \frac{1}{k} \sum_{i=1}^{n} \mathbf{1}\{X_i > X_{n:n-k},\; Y_i > y X_{n:n-k}\}, \tag{3} $$


where $k = k_n$ is a deterministic sequence such that $k \to \infty$, $k/n \to 0$ and $X_{n:1} \le 
\cdots \le X_{n:n}$ are the order statistics. However, in order to provide reliable estimates 
of the conditional tail distribution one needs a sufficient number of pairs 
of observations such that both components exceed the level $X_{n:n-k}$. This 
usually requires a very large number of observations. In summary, the estimator 
(3) may not be particularly useful in practice. 
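
The empirical estimator (3) can be sketched in a few lines. The following is a minimal illustration in Python with NumPy, not the authors' implementation; the function name `empirical_tail_estimator` and its argument names are hypothetical, and the sketch assumes there are no ties among the $X_i$.

```python
import numpy as np

def empirical_tail_estimator(x, y_obs, k, y=1.0):
    """Sketch of estimator (3): share of the k largest X's whose Y also
    exceeds y times the random threshold X_{n:n-k}."""
    x = np.asarray(x, dtype=float)
    y_obs = np.asarray(y_obs, dtype=float)
    n = x.size
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < n")
    # X_{n:n-k}: the (n-k)-th smallest observation (0-based index n-k-1).
    threshold = np.sort(x)[n - k - 1]
    joint = (x > threshold) & (y_obs > y * threshold)
    return float(joint.sum() / k)
```

Only observations for which both components are large contribute, which is why this estimator needs many joint exceedances to be reliable.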


We propose an alternative nonparametric approach to estimating the conditional 
tail distribution and, more generally, to estimating expressions like 
the one in (1). The idea comes from [1], who considered regularly varying time 
series and defined a spectral and a tail spectral process. More specifically, in 
our context of bivariate vectors, regular variation implies that $(X/x, Y/x)$, conditionally 
on $X > x$, converges in distribution (as $x \to \infty$) to a random 
vector $(V_1\Theta_1, V_1\Theta_2)$, where $V_1$ has a standard Pareto distribution, $\Theta_1$ is concentrated 
on $\{-1,1\}$, while $(\Theta_1,\Theta_2)$ is independent of $V_1$. Furthermore, $\Theta_2$ is the 
distributional limit of $Y/X$ given that $X > x$ as $x \to \infty$. The representation 
of the limiting vector is similar to the standard spectral decomposition (see [?, 
Section 8.2.3] or [15, Section 6.1.2]); however, in our case the vector $(\Theta_1,\Theta_2)$ 
does not lie on a unit circle. Hence, we will call $(V_1,\Theta_1,\Theta_2)$ the quasi-spectral 
decomposition. 

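As a consequence, if we assume for simplicity that all random variables are 
nonnegative, then the conditional tail distribution can be expressed in terms of 
$\Theta_2$ as 

$$ \lim_{x\to\infty} \mathbb{P}(Y > yx \mid X > x) = \mathbb{E}\left[\left(\frac{\Theta_2}{y} \wedge 1\right)^{\alpha}\right] = \lim_{x\to\infty} \mathbb{E}\left[\left(\frac{Y}{yX} \wedge 1\right)^{\alpha} \,\Big|\, X > x\right]. $$

Thus, the estimator (3) can be replaced with 

$$ \frac{1}{k} \sum_{i=1}^{n} \left(\frac{Y_i}{yX_i} \wedge 1\right)^{\alpha} \mathbf{1}\{X_i > X_{n:n-k}\}. \tag{4} $$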


We will argue below that the estimator (4) is more efficient than the one in (3) 
(see also [?] for a different context of time series). Of course, if $\alpha$ is unknown, 
it needs to be replaced with an estimator; however, we will provide conditions 
that guarantee that estimation of $\alpha$ does not influence the limiting behaviour 
of the estimator of the conditional tail distribution. This observation is 
also confirmed by simulation studies. We also note that the bivariate case can 
be easily extended to a general multivariate situation, still requiring only one 
component to be large. 
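
For comparison, here is a minimal sketch of the quasi-spectral estimator (4) with a known tail index. It is an illustration under stated assumptions rather than the authors' code: the name `quasi_spectral_tail_estimator` is hypothetical, the data are assumed positive and free of ties, and `alpha` is supplied by the user.

```python
import numpy as np

def quasi_spectral_tail_estimator(x, y_obs, k, alpha, y=1.0):
    """Sketch of estimator (4): average of (Y_i / (y X_i) ^ 1)^alpha over
    the k observations whose X exceeds the threshold X_{n:n-k}."""
    x = np.asarray(x, dtype=float)
    y_obs = np.asarray(y_obs, dtype=float)
    n = x.size
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < n")
    threshold = np.sort(x)[n - k - 1]          # X_{n:n-k}
    exceed = x > threshold                     # only X has to be large
    ratios = np.minimum(y_obs[exceed] / (y * x[exceed]), 1.0)
    return float(np.sum(ratios ** alpha) / k)
```

Every one of the $k$ exceedances of $X$ contributes a value in $[0,1]$, whereas the empirical estimator only counts joint exceedances.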

Furthermore, the quasi-spectral decomposition can be useful in approximating 
the expected shortfall. It turns out that 

$$ \lim_{x\to\infty} x^{-1}\,\mathbb{E}[Y \mid X > x] = \mathbb{E}[V_1]\,\mathbb{E}[\Theta_2] = \frac{\alpha}{\alpha-1} \lim_{x\to\infty} \mathbb{E}\left[ Y/X \mid X > x \right] $$

whenever $\alpha > 1$. Using the above identity we can construct two estimators 
of the expected shortfall. Asymptotic normality of an estimator that is based 
on the left-hand side of the above expression requires finiteness of the second 
moment, while an estimator motivated by the quasi-spectral representation on 
the right-hand side may have finite variance even when $\alpha \in (1, 2)$. 
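
The constant $\alpha/(\alpha-1)$ in the identity above is the mean of the Pareto variable $V_1$. As a short worked step, assuming $\mathbb{P}(V_1 > v) = v^{-\alpha}$ for $v \ge 1$ and $\alpha > 1$, and using the independence of $V_1$ and $\Theta_2$:

```latex
\mathbb{E}[V_1] = 1 + \int_1^\infty \mathbb{P}(V_1 > v)\,dv
               = 1 + \int_1^\infty v^{-\alpha}\,dv
               = 1 + \frac{1}{\alpha-1}
               = \frac{\alpha}{\alpha-1},
\qquad
\mathbb{E}[V_1\Theta_2] = \mathbb{E}[V_1]\,\mathbb{E}[\Theta_2]
                        = \frac{\alpha}{\alpha-1}\,\mathbb{E}[\Theta_2].
```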

In summary, the proposed estimation procedure based on the quasi-spectral 
representation may lead to improvement in terms of efficiency or in terms of 
the conditions required to achieve asymptotic normality, as compared to other 
nonparametric methods. 

In order to support our statement, we proceed as follows. In Section 2 we 
recall the concept of multivariate regular variation (see [?]), followed by the 
quasi-spectral decomposition (Section 2.2). We link it to the conditional tail 
distribution (Section 2.3) and the conditional tail expectation (Section 2.4). We 
note that we present that section in the general framework of d-dimensional vectors. 
In Section 3 we consider weak convergence of tail empirical processes based 
on deterministic and random levels. The theory is used to construct estimators 
of (1). Furthermore, some of the results in [?] and [4] can be concluded from 
ours. The specific cases of the conditional tail distribution and the conditional 
tail expectation are discussed in Sections 4 and 5, respectively. In the latter 
section we link our results to the estimation procedure in [4]. In Section 6 
we conduct extensive simulation studies that show the usefulness of our approach, 
while in the following one we apply our procedure to estimation of the tail dependence 
coefficient for some real data. Some technical details of proofs can be 
found in Section 8. We finish our paper by addressing several technical issues, 
such as different marginals, and directions of future research. 


2 Preliminaries 

We start with some notation that will be used throughout the paper. Unless 
otherwise stated, by $\mathbf{y}$ we denote a vector $(y_1,\dots,y_d)$. For a vector $\mathbf{y}$ we write 
$(\mathbf{y},\infty] = (y_1,\infty] \times \cdots \times (y_d,\infty]$. For $C \subset \overline{\mathbb{R}}^d$ and $y > 0$ we denote $yC = \{y\mathbf{x} : 
\mathbf{x} \in C\}$. As usual, for a given distribution $F$, we write $\bar F(x) = 1 - F(x)$. 

2.1 Multivariate regular variation 

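We start with the following definition (see e.g. [15, Theorem 6.1]). 

Definition 1. A vector $\mathbf{X} = (X_1,\dots,X_d)$ in $\mathbb{R}^d$ is (multivariate) regularly 
varying if there exists a nonzero Radon measure $\nu_{\mathbf X}$ on $\overline{\mathbb{R}}^d \setminus \{0\}$, called the 
exponent measure of $\mathbf{X}$, such that $\nu_{\mathbf X}(\overline{\mathbb{R}}^d \setminus \mathbb{R}^d) = 0$ and a scaling sequence 
$\{c_n\}$ such that the measure $n\mathbb{P}(c_n^{-1}\mathbf{X} \in \cdot)$ converges vaguely on $\overline{\mathbb{R}}^d \setminus \{0\}$ to the 
measure $\nu_{\mathbf X}$, i.e. 

$$ n\mathbb{P}(c_n^{-1}\mathbf{X} \in \cdot) \stackrel{v}{\to} \nu_{\mathbf X}, \quad \text{on } \overline{\mathbb{R}}^d \setminus \{0\}. \tag{5} $$

The limiting measure is homogeneous with some index $-\alpha$, that is $\nu_{\mathbf X}(yC) = 
y^{-\alpha}\nu_{\mathbf X}(C)$ for any $y > 0$ and a relatively compact set $C$. We call $-\alpha$ the index 
of regular variation of $\mathbf{X}$. 

In what follows, we will assume that all components $X_i$ have the same distribution 
$F$ (see also the concluding section for extensions) and are nonnegative (the latter 
assumption is purely technical and can be easily relaxed). Then 

$$ \lim_{x\to\infty} \frac{\mathbb{P}(x^{-1}\mathbf{X} \in A)}{\bar F(x)} = \frac{\nu_{\mathbf X}(A)}{\nu_{\mathbf X}(\{\mathbf{x} : |x_1| > 1\})}. $$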


2.2 Quasi-spectral decomposition 


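We can link vague convergence to weak convergence of conditional probabilities. 
In particular, for relatively compact sets $A$, $B$ in $\overline{\mathbb{R}}^{i-1} \setminus \{0\}$, $\overline{\mathbb{R}}^{d-i} \setminus \{0\}$, 

$$ \lim_{x\to\infty} \mathbb{P}(x^{-1}\mathbf{X} \in A \times (y,\infty] \times B \mid X_i > x) = \frac{\nu_{\mathbf X}(A \times (y,\infty] \times B)}{\nu_{\mathbf X}(\overline{\mathbb{R}}^{i-1} \times (1,\infty] \times \overline{\mathbb{R}}^{d-i})}. $$

In this spirit, regular variation implies a quasi-spectral decomposition. In the time 
series context this approach was used in [1]. 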


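Proposition 1. Let $\mathbf{X}$ be a regularly varying random vector with nonnegative 
regularly varying components with index $-\alpha$. Then, conditionally on $X_1 > x$, as 
$x \to \infty$, 

$$ x^{-1}(X_1,\dots,X_d) \quad \text{and} \quad \left(\frac{X_1}{x}, \frac{X_2}{X_1}, \dots, \frac{X_d}{X_1}\right) $$

converge in distribution to $(V_1,\dots,V_d)$ and $(V_1,\Theta_2,\dots,\Theta_d)$, respectively, where 

1. $V_1$ has the Pareto distribution with index $-\alpha$; 

2. $\Theta_j = V_j/V_1$, $j = 2,\dots,d$, and $(\Theta_2,\dots,\Theta_d)$ is independent of $V_1$. 

Proof. A proof is given in Section 8.1. □ 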


Remark 1. Throughout the paper the quasi-spectral decomposition into $V_1$ 
and $(\Theta_2,\dots,\Theta_d)$ is obtained by conditioning on $X_1$. We can condition on $X_j$ 
for any $j$. Note, however, that for each different $j$ we get different vectors $\mathbf{V}$ 
(that depend formally on $j$). 

2.3 Representation of conditional tail distribution 

We use the quasi-spectral representation to express the conditional tail distribution. 


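Corollary 2. Let $\mathbf{X}$ be a regularly varying random vector with nonnegative 
regularly varying components with index $-\alpha$. Then for $j_2,\dots,j_l,j_{l+1},\dots,j_d \in 
\{1,\dots,d\}$ and $y_j > 0$ we have 

$$ \lim_{x\to\infty} \mathbb{P}(X_{j_{l+1}} > y_{j_{l+1}}x, \dots, X_{j_d} > y_{j_d}x \mid X_1 > x, X_{j_2} > x, \dots, X_{j_l} > x) = \frac{\mathbb{E}\left[\left(\frac{\Theta_{j_{l+1}}}{y_{j_{l+1}}}\right)^{\alpha} \wedge \cdots \wedge \left(\frac{\Theta_{j_d}}{y_{j_d}}\right)^{\alpha} \wedge \Theta_{j_2}^{\alpha} \wedge \cdots \wedge \Theta_{j_l}^{\alpha} \wedge 1\right]}{\mathbb{E}\left[\Theta_{j_2}^{\alpha} \wedge \cdots \wedge \Theta_{j_l}^{\alpha} \wedge 1\right]}. \tag{6} $$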

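Proof. Proposition 1 implies that for $y_1 > 1$, $y_2,\dots,y_d > 0$, 

$$ \begin{aligned}
\lim_{x\to\infty} \mathbb{P}(X_1 > y_1x, \dots, X_d > y_dx \mid X_1 > x) &= \mathbb{P}(V_1 > y_1, \dots, V_d > y_d) \\
&= \mathbb{P}(V_1 > y_1, V_1\Theta_2 > y_2, \dots, V_1\Theta_d > y_d) \\
&= \alpha \int_{y_1}^{\infty} \mathbb{P}(\Theta_2 > y_2/u, \dots, \Theta_d > y_d/u)\, u^{-\alpha-1}\, du \\
&= \mathbb{E}\left[\alpha \int_{y_1}^{\infty} \mathbf{1}\left\{\frac{y_2}{\Theta_2} < u, \dots, \frac{y_d}{\Theta_d} < u\right\} u^{-\alpha-1}\, du\right] \\
&= \mathbb{E}\left[\left(\frac{1}{y_1} \wedge \frac{\Theta_2}{y_2} \wedge \cdots \wedge \frac{\Theta_d}{y_d}\right)^{\alpha}\right].
\end{aligned} $$

Furthermore, 

$$ \mathbb{P}(X_{j_{l+1}} > y_{j_{l+1}}x, \dots, X_{j_d} > y_{j_d}x \mid X_1 > x, X_{j_2} > x, \dots, X_{j_l} > x) = \frac{\mathbb{P}(X_{j_2} > x, \dots, X_{j_l} > x, X_{j_{l+1}} > y_{j_{l+1}}x, \dots, X_{j_d} > y_{j_d}x \mid X_1 > x)}{\mathbb{P}(X_{j_2} > x, \dots, X_{j_l} > x \mid X_1 > x)} $$

and the result follows. □ 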
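
The key step of the proof, $\mathbb{P}(V_1\Theta_2 > y, V_1 > 1) = \mathbb{E}[(\Theta_2/y \wedge 1)^{\alpha}]$ for a Pareto variable $V_1$ independent of $\Theta_2$, can be checked by simulation. The sketch below is only an illustration: the uniform law chosen for $\Theta_2$, the parameter values and the function name `check_identity` are arbitrary assumptions, not part of the paper.

```python
import numpy as np

def check_identity(alpha=2.0, y=1.5, n=200_000, seed=0):
    """Compare P(V1 * Theta2 > y) with E[(Theta2 / y ^ 1)^alpha] by Monte Carlo."""
    rng = np.random.default_rng(seed)
    # V1 is Pareto: P(V1 > v) = v^(-alpha) for v >= 1 (inverse transform).
    v1 = (1.0 - rng.random(n)) ** (-1.0 / alpha)
    # Theta2: an arbitrary nonnegative law, independent of V1 (assumption).
    theta2 = rng.uniform(0.0, 2.0, size=n)
    lhs = float(np.mean(v1 * theta2 > y))
    rhs = float(np.mean(np.minimum(theta2 / y, 1.0) ** alpha))
    return lhs, rhs

lhs, rhs = check_identity()
print(f"P(V1*Theta2 > y) ~ {lhs:.4f},  E[(Theta2/y ^ 1)^alpha] ~ {rhs:.4f}")
```

Both numbers estimate the same quantity, but the second averages a bounded function of $\Theta_2$ only and does not rely on the rare event being observed.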


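We note that the numerator and the denominator in (6) can be expressed 
as limits. In particular, via Proposition 1 the numerator in (6) equals 

$$ \lim_{x\to\infty} \mathbb{E}\left[ g\left(\frac{X_{j_2}}{X_1}, \dots, \frac{X_{j_l}}{X_1}, \frac{X_{j_{l+1}}}{y_{j_{l+1}}X_1}, \dots, \frac{X_{j_d}}{y_{j_d}X_1}\right) \,\Big|\, X_1 > x \right] $$

with a bounded and continuous function $g(u_2,\dots,u_d) = (u_2 \wedge \cdots \wedge u_d \wedge 1)^{\alpha}$. 
Consequently, for $y > 0$, and setting $(X_1,X_2) = (X,Y)$, 

$$ \lim_{x\to\infty} \mathbb{P}(Y > yx \mid X > x) = \mathbb{E}\left[\left(\frac{\Theta_2}{y} \wedge 1\right)^{\alpha}\right] = \lim_{x\to\infty} \mathbb{E}\left[\left(\frac{Y}{yX} \wedge 1\right)^{\alpha} \,\Big|\, X > x\right]. \tag{7} $$

2.4 Representation of conditional tail expectation 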

Corollary 3. Let $\mathbf{X}$ be a regularly varying random vector with nonnegative 
regularly varying components with index $-\alpha$. Assume moreover that for some 
$\delta > 0$ we have 


supE 

x>0 


Xu 


1+5 


Xi > X, Xj^ > X,..., Xjj > X 


< oo . 


( 8 ) 


Then 


E 




Xi > X, Xj^ > X,, Xji > X 


E 


0,, (0,, A • • • A 0„ A 1)^ 


a - 1 E[0“ A ... A 0“ A 1] 


Proof. We note first that (8) implies that $\alpha > 1$. Let $A \in (0,\infty)$. Proposition 1 
implies that as $x \to \infty$ 


E 






Xi > X, Xj., > X,..., Xj, > X 


E[^dl{i<y,^<A}l{Vi>i,v,v,>i.. 
E[0“ A ... A 0“ A 1] 


A computation similar to Corollary [2] yields that the numerator in the last 
expression is 


a 

a — 1 


■E 




A 0„ A 1) 


O' — 



Furthermore, ([8]) implies 


lim limsupE 
^—>^00 x—^00 


X, 


X^Xj^>xA} I Xi > X, Xj.^ > X,..., Xjj^ > X 


< lim A '’limsupE 

-^^00 ai^oo 


Xu 


1+5 


) \ Xi > X, Xj^ > X,..., Xj^ > X 


= 0 . 


□ 


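In particular, if $\alpha > 1$ then setting again $(X_1,X_2) = (X,Y)$, 

$$ \lim_{x\to\infty} \mathbb{E}\left[\frac{Y}{x} \,\Big|\, X > x\right] = \frac{\alpha}{\alpha-1}\,\mathbb{E}[\Theta_2] = \frac{\alpha}{\alpha-1} \lim_{x\to\infty} \mathbb{E}\left[\frac{Y}{X} \,\Big|\, X > x\right] =: H_{\mathrm{CTE}} \tag{9} $$

and the limit is strictly positive in case of extremal dependence, that is when 
the limiting exponent measure $\nu_{\mathbf X}$ in (5) is not concentrated on the axes. 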


3 Weak convergence of tail empirical process 

For clarity of notation we consider the case $d = 2$, and the vector $(X_1,X_2)$ is 
written as $(X,Y)$. Recall that all random variables are nonnegative with the 
distribution function $F$ and regularly varying with the same index $-\alpha$. Assume 
that we have an i.i.d. sample $(X_j,Y_j)$, $j = 1,\dots,n$, from the distribution of 
$(X,Y)$. Let $\psi : \mathbb{R}^2 \to \mathbb{R}_+$. In what follows $u_n$ denotes a scaling sequence, that 
is a sequence such that $u_n \to \infty$ and $n\bar F(u_n) \to \infty$. For $s_0 > 0$, define the 
tail empirical function 


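$$ \widetilde{T}_n(s;\psi,C) = \frac{1}{n\bar F(u_n)} \sum_{j=1}^{n} \psi\left(\frac{X_j}{u_n}, \frac{Y_j}{u_n}\right) \mathbf{1}\{(X_j,Y_j) \in su_nC\}, \quad s \ge s_0, \tag{10} $$

and $T_n(s;\psi,C) = \mathbb{E}[\widetilde{T}_n(s;\psi,C)]$. If $\psi$ is homogeneous with index $\gamma$ then Lemma 7 
implies 

$$ T(s;\psi,C) = \lim_{n\to\infty} T_n(s;\psi,C) = s^{\gamma-\alpha} \int_C \psi(v_1,v_2)\,\nu(dv_1,dv_2), \tag{11} $$

whenever $\psi$ satisfies the appropriate integrability condition (see (39) below). 
Consider the tail empirical process 

$$ G_n(s;\psi,C) = \sqrt{n\bar F(u_n)} \left\{ \widetilde{T}_n(s;\psi,C) - T_n(s;\psi,C) \right\}. \tag{12} $$

Also, define $G_n^*(\cdot)$ to be the process $G_n(\cdot;\psi,C)$ for the function $\psi \equiv 1$ and the 
set $C = \{(x_1,x_2) : x_1 > 1\}$. 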
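
A direct transcription of the tail empirical function (10) for $d = 2$ may make the roles of $\psi$, $C$, $s$ and $u_n$ concrete. This is a hedged sketch: `tail_empirical_function` and its arguments are hypothetical names, the set $C$ is passed as an indicator function `in_c`, and the tail probability $\bar F(u_n)$ is assumed known (it is unknown in practice, which is why random levels are used later in the paper).

```python
import numpy as np

def tail_empirical_function(x, y_obs, s, u_n, tail_prob, psi, in_c):
    """Sketch of (10): (n * Fbar(u_n))^{-1} * sum of psi(X_j/u_n, Y_j/u_n)
    over the observations with (X_j, Y_j) in s * u_n * C."""
    x = np.asarray(x, dtype=float)
    y_obs = np.asarray(y_obs, dtype=float)
    n = x.size
    # (X, Y) lies in s*u_n*C exactly when (X, Y) / (s*u_n) lies in C.
    inside = in_c(x / (s * u_n), y_obs / (s * u_n))
    values = psi(x / u_n, y_obs / u_n)
    return float(np.sum(values * inside) / (n * tail_prob))

# psi = 1 and C = {x1 > 1} give the marginal tail empirical function.
def one(a, b):
    return np.ones_like(a)

def half_space(a, b):
    return a > 1.0

print(tail_empirical_function([1, 2, 3, 4], [1, 1, 1, 1], 1.0, 2.0, 0.5,
                              one, half_space))
```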

The main result of this section is the following weak convergence for the tail 
empirical function. A proof is given in Section 8.2. 

Theorem 4. Let $s_0 > 0$. Assume that $(X_j,Y_j)$ are i.i.d. regularly varying 
random vectors with nonnegative regularly varying components with index $-\alpha$. 
If moreover 

1. $u_n \to \infty$ and $n\bar F(u_n) \to \infty$; 

2. the function $\psi$ is homogeneous with order $\gamma \in \mathbb{R}$; 

3. for $0 < s_0 < s < t$ we have $tC \subset sC$; 

4. there exists $\delta > 0$ such that $\int_C \psi^{2+\delta}(v_1,v_2)\,\nu(dv_1,dv_2) < \infty$; 

then 

$$ (G_n^*(\cdot), G_n(\cdot;\psi,C)) \Rightarrow (G^*(\cdot), G(\cdot;\psi,C)) \tag{13} $$

in $D([s_0,\infty)) \times D([s_0,\infty))$, where $G^*(\cdot)$, $G(\cdot;\psi,C)$ are Gaussian processes with 
the covariance functions 

$$ \mathrm{cov}(G^*(s),G^*(t)) = (s \vee t)^{-\alpha}, $$

$$ \mathrm{cov}(G(s;\psi,C),G(t;\psi,C)) = (s \vee t)^{2\gamma-\alpha} \int_C \psi^2(v_1,v_2)\,\nu(dv_1,dv_2). $$

3.1 Tail empirical process with random levels 

To apply the weak convergence established in Theorem 4 one needs to choose $u_n$. 
The sequence depends on the marginal distribution, which is unknown. Hence, 
we consider the tail empirical process with random levels. We refer the reader 
to [?] and [?]. 

The second issue is that the centering in the tail empirical process (12) is 
$T_n(s;\psi,C)$, not its limit $T(s;\psi,C)$. This will be handled by an appropriate 
"no-bias" condition. 

To proceed, choose a sequence $k = k_n$ such that $k \to \infty$ and $k/n \to 0$ and 
define $u_n$ by $k = n\bar F(u_n)$. Let $X_{n:1} \le X_{n:2} \le \cdots \le X_{n:n}$ be the order statistics 
of $X_j$, $j = 1,\dots,n$. First, from Theorem 4 we conclude the following weak 
convergence. Let $T_n(s) = \bar F(su_n)/\bar F(u_n)$. 

Corollary 5. Assume that the conditions of Theorem 4 are satisfied. Furthermore, 
assume that the distribution function $F$ is continuous and that 

$$ \lim_{n\to\infty} T_n'(s) = -\alpha s^{-\alpha-1} \tag{14} $$

uniformly in a neighborhood of 1. Then 



We note that the normal convergence of the order statistics is standard (see 
e.g. [?, Theorem 2.4.1]), but we need to argue that the convergence holds jointly. 
Furthermore, we impose the following no-bias condition: 

$$ \lim_{n\to\infty} \sup_{s \ge s_0} \sqrt{k}\, |T_n(s;\psi,C) - T(s;\psi,C)| = 0. \tag{15} $$


This leads to the following empirical processes 




where 



(16) 


and 
Theorem 6. Assume that the conditions of Theorem 4 are satisfied. Furthermore, 
assume that the distribution function $F$ is continuous and that (14) holds. Then 

Gn{s; C) => G(s; fj, C) + sT'-“-T'(1; fj, C)G*(1) , (17) 

a 

and 

G„(s; V’, C) ^ G(s; if, C) + fj, C')G*(1) - V’, C)G*(1) . 

a a 

(18) 

in $D([s_0,\infty))$. If moreover (15) is satisfied, then the centering $T_n(s;\psi,C)$ can 
be replaced with its limit $T(s;\psi,C)$. 


4 Conditional tail distribution 

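If we choose $\psi \equiv 1$ and $C = \{(x_1,x_2) : x_1 > 1, x_2 > y\}$, $y > 0$, then $\widetilde{T}_n(s;\psi,C)$ 
in (10) becomes 

$$ \widetilde{T}_n^{(1)}(s;y) = \frac{1}{n\bar F(u_n)} \sum_{j=1}^{n} \mathbf{1}\{X_j > su_n,\; Y_j > su_ny\}. \tag{19} $$

Furthermore, 

$$ T_n^{(1)}(s;y) = \frac{\mathbb{P}(X > su_n, Y > su_ny)}{\mathbb{P}(X > u_n)} \to T^{(1)}(s;y) = s^{-\alpha} \int_{(1,\infty] \times (y,\infty]} \nu(dv_1,dv_2). $$

Hence, $T^{(1)}(1;y) = \lim_{n\to\infty} \mathbb{P}(Y > u_ny \mid X > u_n)$ is the limiting conditional 
tail distribution and $T^{(1)}(1;1)$ is the tail dependence coefficient. We note that 
in terms of the quasi-spectral representation the limiting variance is 

$$ s^{-\alpha}\, \mathbb{E}\left[\left(\frac{\Theta_2}{y} \wedge 1\right)^{\alpha}\right]. \tag{20} $$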


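If we choose $\psi(x_1,x_2) = (x_2/(yx_1) \wedge 1)^{\alpha}$ and $C = \{(x_1,x_2) : x_1 > 1\}$ then 

$$ \widetilde{T}_n^{(2)}(s;y) = \frac{1}{n\bar F(u_n)} \sum_{j=1}^{n} \left(\frac{Y_j}{yX_j} \wedge 1\right)^{\alpha} \mathbf{1}\{X_j > su_n\}, \tag{21} $$

$$ T_n^{(2)}(s;y) = \frac{1}{\bar F(u_n)}\, \mathbb{E}\left[\left(\frac{Y}{yX} \wedge 1\right)^{\alpha} \mathbf{1}\{X > su_n\}\right], \qquad T^{(2)}(s;y) = \lim_{n\to\infty} T_n^{(2)}(s;y). $$

In particular, using (7), 

$$ T^{(2)}(1;y) = \mathbb{E}\left[\left(\frac{\Theta_2}{y} \wedge 1\right)^{\alpha}\right] = \lim_{n\to\infty} \mathbb{P}(Y > u_ny \mid X > u_n). $$

Theorem 4 implies that $\sqrt{n\bar F(u_n)}\,\{\widetilde{T}_n^{(2)}(s;y) - T_n^{(2)}(s;y)\}$ converges to a Gaussian 
process $G^{(2)}(s;y)$ with the limiting variance 

$$ s^{-\alpha}\, \mathbb{E}\left[\left(\frac{\Theta_2}{y} \wedge 1\right)^{2\alpha}\right], \tag{22} $$

which is smaller than the one given in (20) whenever $y > 1$. 

Hence, both tail empirical functions in (19) and (21) can be used to construct 
estimators of the limiting conditional tail distribution. Specifically, we can use 

$$ \widehat{T}_n^{(1)}(1;y) = \frac{1}{k} \sum_{j=1}^{n} \mathbf{1}\{X_j > X_{n:n-k},\; Y_j > yX_{n:n-k}\}, \tag{23} $$

$$ \widehat{T}_n^{(2)}(1;y) = \frac{1}{k} \sum_{j=1}^{n} \left(\frac{Y_j}{yX_j} \wedge 1\right)^{\alpha} \mathbf{1}\{X_j > X_{n:n-k}\}, \tag{24} $$

the latter one when $\alpha$ is known. The above discussion indicates that the second 
estimator can be asymptotically more efficient than the first one. 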


4.1 Unknown a 

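Let $\hat\alpha$ be an estimator of $\alpha$. We redefine $\widehat{T}_n^{(2)}(1;y)$ from (24) as 

$$ \widehat{T}_n^{(2),\hat\alpha}(1;y) = \frac{1}{k} \sum_{j=1}^{n} \left(\frac{Y_j}{yX_j} \wedge 1\right)^{\hat\alpha} \mathbf{1}\{X_j > X_{n:n-k}\}. \tag{25} $$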

We have 



We already know (cf. (1X71) 1 that 

U2iy) ^ G(2)(l;y) + a-l(T(2))'(l;y)G*(l) . 


Using the first order Taylor expansion for a —>■ z“, we have 








U 


yXj 
Let $\hat\alpha$ be the Hill estimator based on $\tilde{k}$ upper order statistics, where $k = o(\tilde{k})$. 
We know that $\sqrt{\tilde{k}}\,(\hat\alpha - \alpha)$ converges 
to a normal random variable. Hence, in order to show that $U_1$ is of a smaller 
order than $U_2(y)$ it suffices to justify that 


iy 


Y, 




A 1 log 


yXj 


A 1 


is bounded in probability,uniformly in y. Assume that for (5 > 0 we have 
E [(02 A 1)“+^| log(02 A l)|y < oo . 

Then recalling that k = nF{un) and Xn-.n-bjun 1, 


lim sup E 


-y 

k 


Y 




A 1 


log 


yx, 


A 1 




<E[(02 Al)“|log(02 Al)|] . 


Hence, Ui is negligible and there is no effect of estimation of a. 
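
In practice the plug-in version (25) requires an estimate of $\alpha$. The sketch below uses the classical Hill estimator; it is an illustration under assumptions, not the authors' code. The names `hill_estimator`, `plug_in_tail_estimator` and `k_alpha` (the number of order statistics used for the tail index) are hypothetical, and the data are assumed positive and free of ties.

```python
import numpy as np

def hill_estimator(x, k_alpha):
    """Hill estimator of alpha based on the k_alpha largest observations."""
    xs = np.sort(np.asarray(x, dtype=float))
    n = xs.size
    if not 1 <= k_alpha < n:
        raise ValueError("k_alpha must satisfy 1 <= k_alpha < n")
    threshold = xs[n - k_alpha - 1]                    # X_{n:n-k_alpha}
    log_excess = np.log(xs[n - k_alpha:] / threshold)  # log(X_{n:n-i+1} / threshold)
    return float(1.0 / np.mean(log_excess))

def plug_in_tail_estimator(x, y_obs, k, k_alpha, y=1.0):
    """Sketch of (25): quasi-spectral estimator with alpha replaced by Hill."""
    x = np.asarray(x, dtype=float)
    y_obs = np.asarray(y_obs, dtype=float)
    alpha_hat = hill_estimator(x, k_alpha)
    threshold = np.sort(x)[x.size - k - 1]             # X_{n:n-k}
    exceed = x > threshold
    ratios = np.minimum(y_obs[exceed] / (y * x[exceed]), 1.0)
    return float(np.sum(ratios ** alpha_hat) / k)
```

Choosing `k_alpha` larger than `k` mirrors the requirement that the estimator of $\alpha$ converges at a faster rate than the tail empirical process.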


5 Conditional Tail Expectation 

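If we choose $\psi(x_1,x_2) = x_2$ and $C = \{(x_1,x_2) : x_1 > 1\}$ then $\widetilde{T}_n(s;\psi,C)$ in (10) 
becomes 

$$ \widetilde{T}_n^{(3)}(s) = \frac{1}{n\bar F(u_n)} \sum_{j=1}^{n} \frac{Y_j}{u_n}\, \mathbf{1}\{X_j > su_n\}, \tag{26} $$

$$ T_n^{(3)}(s) = \frac{1}{\bar F(u_n)}\, \mathbb{E}\left[\frac{Y}{u_n}\, \mathbf{1}\{X > su_n\}\right], \qquad T^{(3)}(s) = s^{1-\alpha} \int_{(1,\infty] \times (0,\infty]} v_2\, \nu(dv_1,dv_2). $$

We note that the limiting variance can be represented as 

$$ s^{2-\alpha}\, \frac{\alpha}{\alpha-2}\, \mathbb{E}[\Theta_2^2]. \tag{27} $$

If we choose $\psi(x_1,x_2) = \frac{\alpha}{\alpha-1}\frac{x_2}{x_1}$ and $C = \{(x_1,x_2) : x_1 > 1\}$ then 

$$ \widetilde{T}_n^{(4)}(s) = \frac{1}{n\bar F(u_n)}\, \frac{\alpha}{\alpha-1} \sum_{j=1}^{n} \frac{Y_j}{X_j}\, \mathbf{1}\{X_j > su_n\}, \tag{28} $$

$$ T^{(4)}(s) = \lim_{n\to\infty} \mathbb{E}[\widetilde{T}_n^{(4)}(s)] = s^{-\alpha}\, \frac{\alpha}{\alpha-1} \int_{(1,\infty] \times (0,\infty]} \frac{v_2}{v_1}\, \nu(dv_1,dv_2). $$

In particular, by (9), 

$$ T^{(4)}(1) = \frac{\alpha}{\alpha-1} \lim_{n\to\infty} \mathbb{E}\left[\frac{Y}{X} \,\Big|\, X > u_n\right] = \lim_{n\to\infty} \mathbb{E}\left[\frac{Y}{u_n} \,\Big|\, X > u_n\right]. \tag{29} $$
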
We have furthermore 


var(G('^)(s)) = s ° f [ vl ( (dui,du 2 ) (30) 

va- 1 / Ao.oo) Ji n 

f vl ru{dv,,dv 2 ). (31) 

\a-lj 7(0,oo) Ji 

The integral in (31) is finite whenever $\alpha > 2$. However, the integral in (30) may 
exist even when $\alpha < 2$ (take trivially the situation of $Y = X$ or $Y = \phi X + \sigma|Z|$, 
where $\phi > 0$, $X$ is regularly varying with index $-\alpha$ and support contained in 
$(\epsilon, \infty)$, $\epsilon > 0$, independent of a standard normal random variable $Z$). 

The limiting variance can be written as 

$$ s^{-\alpha} \left(\frac{\alpha}{\alpha-1}\right)^2 \mathbb{E}[\Theta_2^2]. \tag{32} $$

We note that for $s = 1$ the limiting variance in (32) is smaller than the one in 
(27). Furthermore, the effect of estimating $\alpha$ is negligible if we use an estimator 
of $\alpha$ with a faster rate of convergence, as described in Section 4.1. 
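
The comparison at $s = 1$ reduces to an elementary inequality. As a worked step, assume (this is a reading of the two limiting variances, stated here as an assumption) that the variance based on $\psi(x_1,x_2) = x_2$ equals $\frac{\alpha}{\alpha-2}\mathbb{E}[\Theta_2^2]$ and that the one based on the quasi-spectral choice equals $\left(\frac{\alpha}{\alpha-1}\right)^2\mathbb{E}[\Theta_2^2]$, with $\alpha > 2$. Then:

```latex
\frac{\alpha}{\alpha-2} - \left(\frac{\alpha}{\alpha-1}\right)^{2}
  = \alpha\,\frac{(\alpha-1)^{2} - \alpha(\alpha-2)}{(\alpha-2)(\alpha-1)^{2}}
  = \frac{\alpha}{(\alpha-2)(\alpha-1)^{2}} > 0 ,
\qquad \alpha > 2 .
```

For example, $\alpha = 4$ gives $2$ against $16/9 \approx 1.78$; the gap widens as $\alpha$ decreases towards $2$.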

5.1 Modelling Conditional Tail Expectation 

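Let $U(t) = F^{\leftarrow}(1 - 1/t)$ be the upper quantile function. For a small $p \in (0,1)$ 
we have $\mathbb{P}(X > U(1/p)) = p$. Our goal is to estimate 

$$ \theta(p) = \mathbb{E}[Y_1 \mid X_1 > U(1/p)] $$

when $p$ is small. In case of extremal dependence we have (cf. (9)), whenever 
$p \to 0$, 

$$ \theta(p) \approx H_{\mathrm{CTE}}\, U_X(1/p), \tag{33} $$

where $H_{\mathrm{CTE}} = \lim_{x\to\infty} x^{-1}\mathbb{E}[Y_1 \mid X_1 > x] \in (0,\infty)$. If we model the tail by a 
generalized extreme value distribution, then $U(1/p)$ can be estimated using the 
representation (5.9) in [2], while $H_{\mathrm{CTE}}$ can be estimated using the tail empirical 
functions (26) and (28) as follows. We take $s^{-1}\widetilde{T}_n^{(3)}(s)$ and $\widetilde{T}_n^{(4)}(s)$ and then 
replace $s$ with $X_{n:n-k}/u_n$ to obtain 

$$ \widehat{H}_{\mathrm{CTE}}^{(3)} = \widehat{T}_n^{(3)}(1) = \frac{1}{k} \sum_{j=1}^{n} \frac{Y_j}{X_{n:n-k}}\, \mathbf{1}\{X_j > X_{n:n-k}\}, \tag{34} $$

$$ \widehat{H}_{\mathrm{CTE}}^{(4)} = \widehat{T}_n^{(4)}(1) = \frac{1}{k}\, \frac{\hat\alpha}{\hat\alpha-1} \sum_{j=1}^{n} \frac{Y_j}{X_j}\, \mathbf{1}\{X_j > X_{n:n-k}\}. \tag{35} $$

Then $\widehat{H}_{\mathrm{CTE}}$ can be chosen to be one of the estimators defined in (34)-(35). 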

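Let now $X$ be regularly varying. The function $U$ is regularly varying as 
$p \to 0$. If $F_{n,X}$ is the empirical distribution function associated with $X_1,\dots,X_n$ 
and we set $U_{n,X}(t) = F_{n,X}^{\leftarrow}(1 - 1/t)$, then $U_n(n/k) = X_{n:n-k}$. Thus, when 
$n/k \approx 1/p$, we have the following approximation (see [11, p. 119]): 

$$ \theta(p) \approx H_{\mathrm{CTE}}\, U_X(1/p) \approx H_{\mathrm{CTE}}\, U_X(n/k) \left(\frac{k}{np}\right)^{1/\alpha}. \tag{36} $$

Hence, we can estimate 

$$ \hat\theta(p) = \widehat{H}_{\mathrm{CTE}}\, X_{n:n-k} \left(\frac{k}{np}\right)^{1/\hat\alpha}, $$

where $\hat\alpha$ is an estimator of $\alpha$. 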
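
The estimators of the scaling constant and the extrapolation step can be sketched as follows. This is a minimal illustration under assumptions, not the authors' implementation: the names `cte_scale_estimators` and `extrapolated_cte` are hypothetical, the data are assumed positive with no ties, and `alpha` stands for either a known index or an estimate greater than 1.

```python
import numpy as np

def cte_scale_estimators(x, y_obs, k, alpha):
    """Two sketches of the scaling constant: one averaging Y / threshold,
    the other averaging Y / X and rescaling by alpha / (alpha - 1)."""
    x = np.asarray(x, dtype=float)
    y_obs = np.asarray(y_obs, dtype=float)
    n = x.size
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < n")
    if alpha <= 1:
        raise ValueError("alpha must exceed 1")
    threshold = np.sort(x)[n - k - 1]                  # X_{n:n-k}
    exceed = x > threshold
    h3 = np.sum(y_obs[exceed]) / (k * threshold)
    h4 = alpha / (alpha - 1.0) * np.sum(y_obs[exceed] / x[exceed]) / k
    return float(h3), float(h4), float(threshold)

def extrapolated_cte(h_cte, threshold, k, n, p, alpha):
    """Extrapolate to level p: h * X_{n:n-k} * (k / (n p))^(1/alpha)."""
    return float(h_cte * threshold * (k / (n * p)) ** (1.0 / alpha))
```

The first estimator sums the raw $Y_j$, so its variance involves second moments; the second only involves the ratios $Y_j/X_j$.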

Equation (36) leads to the following estimators of $\theta(p)$: 


0(3) (p) 


1 

i=i 



1 

■]:'^^ 3 '^{X,>X„..„_k} X 
(37) 


We note that (1571) is precisely the estimator used in [?] and our Theorem [5] can 
be used to conclude their Theorem 1 under slightly different conditions. Indeed, 
using dMl) and noting that U(n/k) = we have 


\/k ■ 


'0(3)(p) 

0(p) 


- 1 


\/k 


H 


(3) 


CTE-^n:n-/c .. 

u^Mk) - 


l^CTE 


{-^)"^^Ux{n/k) 
Ux{l/p) 


'/k • 


- 1 




(3) 


'^CTEUx{n/k) 


(38) 


^ ( 3 ) 

We can recognize 


jUxinlk) to be 


1 

k 


E 

1=1 


w 




and its convergence can be concluded from (EZD, while the bias term in (1551) can 
be handled by imposing a second order condition as in [3]. 

Now, the case of estimated a in 6 ^^\p). Applying the first order Taylor 
expansion, we have 


so that 




l/a 
If for some $\delta_n \to \infty$ and a random variable $A$ we have 



and lim„_>oo = r G [ 0 , oo), then estimation of a yields an addi¬ 

tional contribution rA. This is exactly the situation of Theorem 1 in [4], how¬ 
ever note that they did not require that the vector {X,Y) is regularly varying. 
Nevertheless, their Theorem 1 can be recovered from our results. 


6 Implementation. Simulation studies 

We perform simulation studies to illustrate our theoretical results. We illustrate 
estimation of the tail dependence coefficient 

$$ \mathrm{TDC} := \lim_{x\to\infty} \mathbb{P}(Y > x \mid X > x). $$

We use the estimators defined in (23), (24) and (25). At the first step we plot 
estimates computed for different numbers $k$ of 
order statistics. Next, we conduct Monte Carlo estimation for particular choices 
of $k$ (5%, 10%, 20%, 30% and 40% of observations). The number of Monte Carlo 
iterations is chosen to be 1000. 

Our simulations indicate that the quasi-spectral method is less variable and more 
robust (in terms of the choice of $k$) than the standard empirical method, even 
if the parameter $\alpha$ has to be estimated. 

6.1 A toy example: simple linear model 

We simulate 1000 observations from the model $Y = \phi X + \sigma|Z|$, where $\phi \in (0,1)$, 
$\sigma > 0$, $X$ is standard Pareto with $\alpha > 0$ and $Z$ is standard normal. In this case 
the tail dependence coefficient is $\phi^{\alpha}$. 
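
A sample from this toy model can be generated as in the sketch below. It is an illustration only: the function name `simulate_linear_model` and the seed are hypothetical choices, and the parameter values are those reported in the figure captions ($\phi = 0.8$, $\alpha = 4$, $\sigma = 0.1$).

```python
import numpy as np

def simulate_linear_model(n, phi, sigma, alpha, seed=None):
    """Draw (X, Y) with X standard Pareto(alpha) and Y = phi*X + sigma*|Z|."""
    rng = np.random.default_rng(seed)
    # Inverse transform: P(X > x) = x^(-alpha) for x >= 1.
    x = (1.0 - rng.random(n)) ** (-1.0 / alpha)
    z = rng.standard_normal(n)
    y = phi * x + sigma * np.abs(z)
    return x, y

phi, sigma, alpha = 0.8, 0.1, 4.0
x, y = simulate_linear_model(1000, phi, sigma, alpha, seed=1)
true_tdc = phi ** alpha   # limiting value of P(Y > x | X > x) in this model
print(f"true TDC = {true_tdc:.4f}")
```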

Figure 1 shows the estimated values using the three estimators, computed 
for different values of $k$, where $k$ is the number of order statistics being 
used. On the x-axes the actual values of the order statistics $X_{n:1},\dots,X_{n:n}$ are plotted 
in increasing order. Hence, the estimators computed at the left end of 
each picture use a large number of order statistics, while those at the right end use 
few order statistics. This is different from the Hill plot. The first 
observation (not surprisingly) is that the empirical estimator $\widehat{T}_n^{(1)}(1;1)$ is very 
sensitive with respect to the number of order statistics $k$, and is completely 
useless when plotted against large values of order statistics. The estimators 
motivated by the quasi-spectral representation are more "stable", even if the 
parameter $\alpha$ has to be estimated. 

Figures 2 and 3 show Monte Carlo estimates of TDC using $\widehat{T}_n^{(1)}(1;1)$, $\widehat{T}_n^{(2)}(1;1)$ 
(Figure 2) and $\widehat{T}_n^{(2),\hat\alpha}(1;1)$ (Figure 3), where the estimators are computed based 
on $k$ = 5%, 10%, 20%, 30%, 40% upper order statistics. The parameter $\alpha$ in 
$\widehat{T}_n^{(2),\hat\alpha}(1;1)$ is estimated using the Hill estimator based on $k_\alpha$ = 5%, 10%, 20%, 40% 
of upper order statistics. 

Figure 1: Estimation of TDC for the model $Y = \phi X + \sigma|Z|$ with $\phi = 0.8$, $\alpha = 4$, 
$\sigma = 0.1$. The dotted line shows the true value $\phi^{\alpha}$. Top line, left: estimator $\widehat{T}_n^{(1)}(1;1)$; 
top line, right: estimator $\widehat{T}_n^{(2)}(1;1)$; bottom line: estimators $\widehat{T}_n^{(2),\hat\alpha}(1;1)$, where $\alpha$ is 
estimated using the Hill estimator based on 10% (left picture) and 20% (right picture) 
of order statistics. (Panels plot "Conditional Probabilities" against "Order Statistics".) 

6.2 Bivariate t 

We simulate 1000 observations from the bivariate t-distribution, that is $(X,Y) = 
\sqrt{W}\,(|Z_1|, |Z_2|)$, where $\alpha/W$ is chi-square with $\alpha = 4$ degrees of freedom and 
$(Z_1,Z_2)$ are standard normal with correlation $\phi = 0.9$. In this case the tail 
dependence coefficient is 0.63, see [?]. 
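
The construction of this sample can be sketched as follows. The function name `simulate_abs_bivariate_t` and the seed are hypothetical; the sketch builds the correlated normals by hand and assumes, as is standard for the t-distribution, that $W$ is independent of $(Z_1,Z_2)$.

```python
import numpy as np

def simulate_abs_bivariate_t(n, alpha, rho, seed=None):
    """Draw (X, Y) = sqrt(W) * (|Z1|, |Z2|) with alpha / W chi-square(alpha)
    and (Z1, Z2) standard normal with correlation rho."""
    rng = np.random.default_rng(seed)
    z1 = rng.standard_normal(n)
    # Z2 = rho*Z1 + sqrt(1 - rho^2)*E has unit variance and corr(Z1, Z2) = rho.
    z2 = rho * z1 + np.sqrt(1.0 - rho ** 2) * rng.standard_normal(n)
    w = alpha / rng.chisquare(alpha, size=n)   # so that alpha / W ~ chi-square(alpha)
    scale = np.sqrt(w)
    return scale * np.abs(z1), scale * np.abs(z2)

x_t, y_t = simulate_abs_bivariate_t(1000, alpha=4.0, rho=0.9, seed=2)
```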

7 Data Analysis 

We analyse absolute log-returns of the S&P500 and NASDAQ composite indices 
from January 2, 2013 until June 24, 2014. The scatter plot indicates strong 
dependence in the upper tail. This is confirmed by the estimation of the tail 
dependence coefficient. Again, the quasi-spectral method is less variable than 
the empirical one and robust with respect to the number $k$ of order statistics 
and to the estimation of $\alpha$. 

Figure 2: Estimation of TDC for the model $Y = \phi X + \sigma|Z|$ with $\phi = 0.8$, $\alpha = 4$, 
$\sigma = 0.1$. The dotted line shows the true value $\phi^{\alpha}$. Left panel (empirical method): estimator $\widehat{T}_n^{(1)}(1;1)$; 
right panel (quasi-spectral method): estimator $\widehat{T}_n^{(2)}(1;1)$. Each figure shows the boxplots for estimated values of 
the conditional probability computed for five different values of $k$. The first boxplot is 
computed based on 40% of observations, the second one based on 30% of observations, 
and the remaining ones based on 20%, 10% and 5%. 

Figure 3: Estimation of TDC for the model $Y = \phi X + \sigma|Z|$ with $\phi = 0.8$, $\alpha = 4$, 
$\sigma = 0.1$. The dotted line shows the true value $\phi^{\alpha}$. Estimators $\widehat{T}_n^{(2),\hat\alpha}(1;1)$ computed 
for $\hat\alpha$ obtained by the Hill estimator based on 5% (top left), 10% (top right), 20% 
(bottom left) and 40% (bottom right) order statistics. Each figure shows the boxplots 
for estimated values of the conditional probability computed for five different values of 
$k$. The first boxplot is computed based on 40% of observations, the second one based 
on 30% of observations, and the remaining ones based on 20%, 10% and 5%. 

Figure 4: Estimation of TDC for the bivariate t. Top line, left: estimator $\widehat{T}_n^{(1)}(1;1)$; 
top line, right: estimator $\widehat{T}_n^{(2)}(1;1)$; bottom line: estimators $\widehat{T}_n^{(2),\hat\alpha}(1;1)$, where $\alpha$ is 
estimated using the Hill estimator based on 10% (left picture) and 20% (right picture) 
of order statistics. (Panels plot "Conditional Probabilities" against "Order Statistics".) 

Figure 5: Estimation of TDC for the bivariate t. Left panel (empirical method): estimator $\widehat{T}_n^{(1)}(1;1)$; right 
panel (quasi-spectral method): estimator $\widehat{T}_n^{(2)}(1;1)$. Each figure shows the boxplots for estimated values of the 
conditional probability computed for five different values of $k$. The first boxplot is 
computed based on 40% of observations, the second one based on 30% of observations, 
and the remaining ones based on 20%, 10% and 5%. 


8 Technical Details 


We state the following lemma without proof. 

Lemma 7. Let $\mathbf{X}$ be a regularly varying random vector such that all components 
are regularly varying with the same index $-\alpha$. Let $\psi : \mathbb{R}^d \to \mathbb{R}_+$ be homogeneous 
with index $\gamma$ and assume that for some $\delta > 0$, 


f {v)v> (dv) < 00 . 

Jc 


(39) 


—d , 


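Then for $s > \epsilon$ and a relatively compact set $C$ in $\overline{\mathbb{R}}^d \setminus \{0\}$ we have 

$$ \lim_{x\to\infty} \frac{1}{\bar F(x)}\, \mathbb{E}\left[\psi\left(\frac{\mathbf{X}}{x}\right) \mathbf{1}\{\mathbf{X} \in sxC\}\right] = s^{\gamma-\alpha} \int_C \psi(\mathbf{v})\,\nu(d\mathbf{v}). $$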

8.1 Proof of Proposition [T] 

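Proof. Since $\mathbf{X}$ is regularly varying we have for $A \subset \overline{\mathbb{R}}^{d-1}$ 

$$ \lim_{x\to\infty} \frac{\mathbb{P}(x^{-1}\mathbf{X} \in (y,\infty] \times A)}{\mathbb{P}(X_1 > x)} = \frac{\nu((y,\infty] \times A)}{\nu((1,\infty] \times \overline{\mathbb{R}}^{d-1})}. $$

If moreover $y > 1$, the left hand side becomes the conditional probability 

$$ \lim_{x\to\infty} \mathbb{P}(x^{-1}\mathbf{X} \in (y,\infty] \times A \mid X_1 > x). $$

In other words, conditionally on $X_1 > x$, $x^{-1}\mathbf{X}$ converges weakly to a random 
vector, say $\mathbf{V} = (V_1,\dots,V_d)$. Therefore, for any $f : \mathbb{R}^d \to \mathbb{R}$ bounded and 
continuous we have 

$$ \lim_{x\to\infty} \mathbb{E}[f(x^{-1}\mathbf{X}) \mid X_1 > x] = \mathbb{E}[f(\mathbf{V})]. $$

Now, let $g : \mathbb{R}^d \to \mathbb{R}$ be bounded and continuous. Then 

$$ \mathbb{E}\left[g\left(\frac{X_1}{x}, \frac{X_2}{X_1}, \dots, \frac{X_d}{X_1}\right) \,\Big|\, X_1 > x\right] = \mathbb{E}\left[f\left(\frac{X_1}{x}, \dots, \frac{X_d}{x}\right) \,\Big|\, X_1 > x\right], $$

where $f(u_1,\dots,u_d) = g(u_1, u_2/u_1, \dots, u_d/u_1)$ is also bounded and continuous 
whenever $u_1 > 1$. Hence, 

$$ \lim_{x\to\infty} \mathbb{E}\left[g\left(\frac{X_1}{x}, \frac{X_2}{X_1}, \dots, \frac{X_d}{X_1}\right) \,\Big|\, X_1 > x\right] = \mathbb{E}[g(V_1, V_2/V_1, \dots, V_d/V_1)]. $$

Hence, conditionally on $X_1 > x$, 

$$ \left(\frac{X_1}{x}, \frac{X_2}{X_1}, \dots, \frac{X_d}{X_1}\right) $$

converges in distribution to $(V_1, V_2/V_1, \dots, V_d/V_1) = (V_1, \Theta_2, \dots, \Theta_d)$. It is obvious 
that $V_1$ has a standard Pareto distribution. We claim that $V_1$ is independent 
of $(\Theta_2,\dots,\Theta_d)$. Indeed, for $A_i \subset \mathbb{R}$, $i = 2,\dots,d$, 

$$ \mathbb{P}\left(\frac{X_1}{x} > y, \frac{X_2}{X_1} \in A_2, \dots, \frac{X_d}{X_1} \in A_d \,\Big|\, X_1 > x\right) = \mathbb{P}\left(\frac{X_2}{X_1} \in A_2, \dots, \frac{X_d}{X_1} \in A_d \,\Big|\, X_1 > xy\right) \frac{\mathbb{P}(X_1 > xy)}{\mathbb{P}(X_1 > x)} \to \mathbb{P}\left(\frac{V_2}{V_1} \in A_2, \dots, \frac{V_d}{V_1} \in A_d\right) \mathbb{P}(V_1 > y), \quad x \to \infty. $$

On the other hand, 

$$ \lim_{x\to\infty} \mathbb{P}\left(\frac{X_1}{x} > y, \frac{X_2}{X_1} \in A_2, \dots, \frac{X_d}{X_1} \in A_d \,\Big|\, X_1 > x\right) = \mathbb{P}\left(V_1 > y, \frac{V_2}{V_1} \in A_2, \dots, \frac{V_d}{V_1} \in A_d\right). $$

Hence, $(\Theta_2 = V_2/V_1, \dots, \Theta_d = V_d/V_1)$ and $V_1$ are independent. □ 

Figure 6: Estimation of TDC for the bivariate t. Estimators $\widehat{T}_n^{(2),\hat\alpha}(1;1)$ computed 
for $\hat\alpha$ obtained by the Hill estimator based on 5% (top left), 10% (top right), 20% 
(bottom left) and 40% (bottom right) order statistics. Each figure shows the boxplots 
for estimated values of the conditional probability computed for five different values of 
$k$. The first boxplot is computed based on 40% of observations, the second one based 
on 30% of observations, and the remaining ones based on 20%, 10% and 5%. 

Figure 7: Scatter plot for S&P vs. NASDAQ 

Figure 8: Estimation of TDC for S&P and NASDAQ. Left plot: empirical method; 
middle plot: quasi-spectral method with $k_\alpha = 0.1n$; right plot: quasi-spectral method 
with $k_\alpha = 0.2n$ 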


8.2 Proof of Theorem 4 

The proof is relatively standard, but we provide it for completeness. We start 
with the central limit theorem. Multivariate convergence follows by the Cramér-Wald 
device. We prove the result only for $G_n(\cdot;\psi,C)$. 

Lemma 8. Under the conditions of Theorem 4, for each $s \ge s_0$, $G_n(s;\psi,C)$ 
converges in distribution to a centered normal random variable. 

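Proof. We prove the central limit theorem by checking Lindeberg's conditions. 
Let 

$$ Z_{n,j}(s;C) = \frac{1}{\sqrt{n\bar F(u_n)}} \left\{ \psi\left(\frac{X_j}{u_n}, \frac{Y_j}{u_n}\right) \mathbf{1}\{(X_j,Y_j) \in su_nC\} - \mathbb{E}\left[\psi\left(\frac{X}{u_n}, \frac{Y}{u_n}\right) \mathbf{1}\{(X,Y) \in su_nC\}\right] \right\} $$

so that $G_n(s;\psi,C) = \sum_{j=1}^{n} Z_{n,j}(s;C)$. Clearly, $\mathbb{E}[Z_{n,j}(s;C)] = 0$. Furthermore, 

$$ \mathrm{var}(G_n(s;\psi,C)) = \frac{1}{\bar F(u_n)}\, \mathbb{E}\left[\psi^2\left(\frac{X}{u_n}, \frac{Y}{u_n}\right) \mathbf{1}\{(X,Y) \in su_nC\}\right] - \bar F(u_n) \left( \frac{1}{\bar F(u_n)}\, \mathbb{E}\left[\psi\left(\frac{X}{u_n}, \frac{Y}{u_n}\right) \mathbf{1}\{(X,Y) \in su_nC\}\right] \right)^2. $$

Since $\bar F(u_n) \to 0$ as $n \to \infty$, Lemma 7 implies that the first term dominates 
and $\lim_{n\to\infty} \mathrm{var}(G_n(s;\psi,C))$ exists. 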
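Furthermore, noting that for arbitrary $\delta > 0$ and any random variable $Y$, 
$\mathbf{1}\{|Y| > c\} \le |Y|^{\delta}/c^{\delta}$, we have 

$$ \mathbb{E}\left[|Z_{n,j}(s;C)|^{2+\delta}\right] \le \frac{K}{(n\bar F(u_n))^{1+\delta/2}}\, \mathbb{E}\left[\psi^{2+\delta}\left(\frac{X_1}{u_n}, \frac{X_2}{u_n}\right) \mathbf{1}\{(X_1,X_2) \in su_nC\}\right] $$

and hence 

$$ \sum_{j=1}^{n} \mathbb{E}\left[Z_{n,j}^2(s;C)\, \mathbf{1}\{|Z_{n,j}(s;C)| > \epsilon\}\right] \le \frac{K}{\epsilon^{\delta}(n\bar F(u_n))^{\delta/2}}\, \frac{1}{\bar F(u_n)}\, \mathbb{E}\left[\psi^{2+\delta}\left(\frac{X_1}{u_n}, \frac{X_2}{u_n}\right) \mathbf{1}\{(X_1,X_2) \in su_nC\}\right]. $$

Using Lemma 7 and since $\delta > 0$, the expression on the right hand side converges 
to 0. □ 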


Lemma 9. Under the conditions of Theorem 4 the sequence of processes $\{G_n(\cdot;\psi,C)\}$, 
$n \ge 1$, is tight in $D([s_0,\infty))$ equipped with the Skorokhod topology. 

Proof. In what follows, since the set $C$ is fixed, we omit the dependence 
on it in our notation unless it is necessary. For $s_0 \le s < t$, define $(s,t]u_nC = 
(su_nC) \setminus (tu_nC)$ and 

Un,jis) = 


^{{X^,Yj)esu^C} , 


U:^^{s) = Ur.,j{s)-E[UnAs)], 


Un,j(5,t) — Uji^ji^s') Ujijlf') , 


UAAt) = u:As)-UlAt), 


1 

F{Un) 


gAs;m)= ^ E[|Unj-(s)p] , gn{s,t;rn) = gn{s;m) - gn{t;m) . 


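We note that $g_n(s;m)$ converges as $n \to \infty$, uniformly on $[s_0,\infty)$. Then

\[
G_n(s) - G_n(t) = \frac{1}{\sqrt{n\bar F(u_n)}} \sum_{j=1}^n U^*_{n,j}(s,t) ,
\]

where we write $G_n(s)$ for $G_n(s;\psi,C)$. We use Theorem 13.5 in [3]. For
$s_0 < s_1 < t < s_2$ we have

\[
E\left[|G_n(s_1) - G_n(t)|^2 \, |G_n(t) - G_n(s_2)|^2\right]
\le \frac{1}{(n\bar F(u_n))^2} \sum_{j=1}^n E\left[\left(U^*_{n,j}(s_1,t) U^*_{n,j}(t,s_2)\right)^2\right]
+ \frac{3}{(n\bar F(u_n))^2} \sum_{i\ne j} E\left[(U^*_{n,i}(s_1,t))^2\right] E\left[(U^*_{n,j}(t,s_2))^2\right] . \tag{40}
\]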


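By noting that for $s_1 < t < s_2$ we have $U_{n,j}(s_1,t) U_{n,j}(t,s_2) = 0$, we evaluate

\[
\begin{aligned}
\left(U^*_{n,j}(s_1,t) U^*_{n,j}(t,s_2)\right)^2
&= U^2_{n,j}(s_1,t) E^2[U_{n,j}(t,s_2)] + U^2_{n,j}(t,s_2) E^2[U_{n,j}(s_1,t)] \\
&\quad - 2 U_{n,j}(s_1,t) E[U_{n,j}(s_1,t)] E^2[U_{n,j}(t,s_2)] - 2 U_{n,j}(t,s_2) E[U_{n,j}(t,s_2)] E^2[U_{n,j}(s_1,t)] \\
&\quad + E^2[U_{n,j}(s_1,t)] E^2[U_{n,j}(t,s_2)] ,
\end{aligned}
\]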
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so that 


F^{Ur.) 


E 


iU:jsut)U:jt,S2)f < 3F(u„)5^(si,S2;1) 


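Next, we deal with the second term in (40). For $s < t$ we have

\[
E\left[(U^*_{n,j}(s,t))^2\right] \le 4 E\left[(U_{n,j}(s) - U_{n,j}(t))^2\right] .
\]

Hence, the term is bounded by

\[
\frac{K n^2}{(n\bar F(u_n))^2} E\left[U^2_{n,1}(s_1,t)\right] E\left[U^2_{n,1}(t,s_2)\right] \le K g_n^2(s_1,s_2;2) .
\]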

The tightness follows. □ 

8.3 Proof of Corollary [5] 

The argument is similar to that of [16]. 

• By Theorem 4 and the Skorokhod representation theorem, there exist a
probability space and, defined on it, a sequence of processes
$\{G_n^*(\cdot), G_n(\cdot;\psi,C)\}$ and processes $G^*(\cdot)$, $G(\cdot;\psi,C)$ with the same
distributions as the original ones, such that

\[
G_n^*(\cdot) \to G^*(\cdot) , \qquad G_n(\cdot;\psi,C) \to G(\cdot;\psi,C) \tag{41}
\]

almost surely, uniformly on compact subsets of $[s_0,\infty)$. In what follows,
for simplicity of notation, we keep writing $G_n^*(\cdot)$, $G_n(\cdot;\psi,C)$, $G^*(\cdot)$ and
$G(\cdot;\psi,C)$ for these versions.

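• Let $T_n^{\leftarrow}$ and $(\tilde T_n)^{\leftarrow}$ be the right continuous inverses of $T_n$ and
$\tilde T_n$, respectively. Then, $T_n^{\leftarrow}(1) = 1$, $(\tilde T_n)^{\leftarrow}(1) = X_{n:n-k}/u_n$ and, since
$F$ is continuous, for all $s \in (0, \bar F(0)/\bar F(u_n)]$, $T_n(T_n^{\leftarrow}(s)) = s$.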

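• The (random) functions $G_n^*$ and $T_n^{\leftarrow}$ belong to $\mathbb{D}$. Furthermore, their
almost sure limits $G^*$ and $T^{\leftarrow}$ are continuous and $T^{\leftarrow}$ is strictly decreasing.
Hence, the convergence (41) and Theorem 3.1 in [?] imply that

\[
G_n^*(T_n^{\leftarrow}(s)) = \sqrt{k}\left\{\tilde T_n \circ T_n^{\leftarrow}(s) - s\right\} \to G^*(T^{\leftarrow}(s))
\]

almost surely, uniformly on compact subsets of $[s_0,\infty)$.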

• Vervaat's lemma ([?, Lemma A.0.2]) implies that

\[
\sqrt{k}\left\{(\tilde T_n \circ T_n^{\leftarrow})^{\leftarrow}(s) - s\right\} \to -G^*(T^{\leftarrow}(s))
\]

almost surely, uniformly on compact subsets of $[s_0,\infty)$.

• Assumption (14) implies that $T_n$ is continuous and strictly decreasing in
a neighborhood of 1. Thus, there exists $\epsilon > 0$ such that
$T_n \circ (\tilde T_n)^{\leftarrow}(s) = (\tilde T_n \circ T_n^{\leftarrow})^{\leftarrow}(s)$ for $s \in (1-\epsilon, 1+\epsilon)$ and

\[
\sqrt{k}\left\{T_n \circ (\tilde T_n)^{\leftarrow}(s) - s\right\} \to -G^*(T^{\leftarrow}(s)) , \tag{42}
\]

almost surely, uniformly with respect to $s \in (1-\epsilon, 1+\epsilon)$.

• Since $k \to \infty$ and $(\tilde T_n)^{\leftarrow}(1) = X_{n:n-k}/u_n$, (42) implies that
$T_n(X_{n:n-k}/u_n)$ converges almost surely to 1. Since $T(1) = 1$ and $T_n$ converges
uniformly to $T$ in a neighborhood of 1, this implies that $X_{n:n-k}/u_n$ converges
almost surely to 1.

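• By Taylor's expansion, there exists $\xi_n$ such that
$|\xi_n - 1| \le |(\tilde T_n)^{\leftarrow}(1) - 1|$ and

\[
\begin{aligned}
T_n((\tilde T_n)^{\leftarrow}(1)) - 1 &= T_n((\tilde T_n)^{\leftarrow}(1)) - T_n(T_n^{\leftarrow}(1)) \\
&= T_n'(\xi_n)\left\{(\tilde T_n)^{\leftarrow}(1) - T_n^{\leftarrow}(1)\right\} \\
&= T_n'(\xi_n)\left\{X_{n:n-k}/u_n - 1\right\} .
\end{aligned} \tag{43}
\]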

• Thus, (41), (42) and (43) yield that

\[
\sqrt{k}\left\{\frac{X_{n:n-k}}{u_n} - 1\right\} \to \frac{1}{\alpha} G^*(1) , \tag{44}
\]

almost surely.

• Since the convergences $G_n(\cdot;\psi,C) \to G(\cdot;\psi,C)$ and (44) hold almost
surely, they hold jointly. Coming back to the original probability space,
we obtain the joint weak convergence.
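The behaviour of the intermediate order statistic in (44) can be illustrated by simulation. The sketch below is a hedged illustration and not the authors' procedure: it assumes an exact Pareto($\alpha$) sample, for which $u_n = (n/k)^{1/\alpha}$ solves $n\bar F(u_n) = k$, and the function name `intermediate_order_ratio` is hypothetical. It computes $X_{n:n-k}/u_n$ and the rescaled deviation $\sqrt{k}\,(X_{n:n-k}/u_n - 1)$.

```python
import math
import random

def intermediate_order_ratio(n, k, alpha, seed=0):
    """Return X_{n:n-k} / u_n for an i.i.d. exact Pareto(alpha) sample."""
    rng = random.Random(seed)
    sample = sorted(rng.paretovariate(alpha) for _ in range(n))
    x_nk = sample[n - k - 1]           # X_{n:n-k}: exactly k observations lie above it
    u_n = (n / k) ** (1.0 / alpha)     # solves n * P(X > u_n) = k (Pareto assumption)
    return x_nk / u_n

n, k, alpha = 100_000, 1_000, 2.0
ratio = intermediate_order_ratio(n, k, alpha)
scaled = math.sqrt(k) * (ratio - 1.0)
print(ratio, scaled)
```

For this exact Pareto model, the classical normal approximation for intermediate order statistics suggests fluctuations of `ratio` around 1 of order $1/(\alpha\sqrt{k})$, which is the scaling that appears in (44).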

8.4 Proof of Theorem [6] 

Proof. Denote $\hat T_n(s;\psi,C) = \tilde T_n(sX_{n:n-k}/u_n;\psi,C)$, where $\tilde T_n$ and $\hat T_n$ are the
tail empirical functions defined in (??) and (??), respectively. Then, by the
homogeneity property (??),

+ S-^-^Vk {T{Xn-.n-k/Un, , C) - Til' , C)} = h{s) + . 

By Corollary 5,

\[
\sqrt{k}\left\{\frac{X_{n:n-k}}{u_n} - 1\right\} \Rightarrow \frac{1}{\alpha} G^*(1) , \tag{45}
\]

jointly with $G_n(\cdot;\psi,C)$. In particular, $X_{n:n-k}/u_n$ converges in probability to 1.
Thus, by Theorem |4l the term Ii converges weakly to while by the 

delta method the term / 2 (s) converges weakly to 


a 


This finishes the proof of (ED- Furthermore, 


G„(s; C) 


\ Un J 



{'^n{s^n:n—kl '^ni '4^ t C') T{sXn:n—kl'^n] l/’? G)} 


n\n— 


n:n— 


Un J J 

+ {TiXn-.„-k/un; V', C) - r(l; C)} 



n'.n— 



Again, by Theorem 4 and $X_{n:n-k}/u_n \to 1$ in probability, the first term converges
weakly to $G(\cdot;\psi,C)$. The second term vanishes by (15). Furthermore, the delta
method, the first order Taylor expansion of $T(\cdot;\psi,C)$ around 1 and (45) yield that
converges to 


r(l; V', CIG*!!) + C)G*(1) . 

a 



The convergence (flSll is proven. 


□ 


9 Additional comments and future research 

We finish our paper by addressing several technical issues and discussing
directions of future research.

1. We assume regular variation of the vector $(X, Y)$ since we work under the
general framework of estimating (1). In specific examples, like the conditional
tail expectation, it is enough to assume that the limit
$\lim_{x\to\infty} E[Y/x \mid X > x]$ exists and is strictly positive. This is done
precisely in [1].

2. At the expense of additional technical considerations one can study tightness
with respect to a class of sets $C \in \mathcal{C}$, which in particular will imply
tightness with respect to $y$ in the case of the conditional tail distribution.

3. The results are meaningful in the case of extremal dependence, that is, when
the exponent measure is not concentrated on the axes. In the case of extremal
independence, if one wants to estimate quantities like the conditional tail
distribution or the conditional tail expectation, a different scaling is required.
We will address this issue in a subsequent paper, based upon the ideas
developed in [?].

4. The quasi-spectral method should be compared with semiparametric or 
parametric ones. It could be particularly attractive in case of time series 
where very few parametric models for multivariate extremes are available. 

5. We would like to address estimation of the conditional tail expectation in
the context of multivariate time series, using the tools developed in [?].

6. It is a common practice in extreme value theory to standardize marginals.
Assume that we have a positive bivariate vector $(X, Y)$ with continuous marginal
distribution functions $F_X$ and $F_Y$. Define

\[
Q_X(x) = \frac{1}{1 - F_X(x)} , \qquad Q_Y(y) = \frac{1}{1 - F_Y(y)} .
\]

Then $Z = Q_X(X)$ and $W = Q_Y(Y)$ are standard Pareto. All results
in the paper remain valid if one assumes that $(Z, W)$ is regularly varying
(with index $\alpha = 1$). If $(V_1', V_1'\Theta_2')$ is the quasi-spectral decomposition of
$(Z, W)$, then $V_1'$ is standard Pareto, however $\Theta_2'$ still contains information
about the marginal behaviour. For example, if we start with $(X, Y)$ being
regularly varying with index $-\alpha$ and $(V_1, V_1\Theta_2)$ is its quasi-spectral
decomposition, then $\Theta_2' = \Theta_2^{\alpha}$. In other words, by transforming marginals we do
not avoid the problem of estimating $\alpha$ in (24).
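The marginal standardization in the last remark can be illustrated in a few lines. This is a minimal sketch under stated assumptions: it uses the usual transform $Q(x) = 1/(1 - F(x))$ for a continuous distribution function $F$, and takes a standard exponential marginal purely as an example; the function name `pareto_standardize` and the chosen marginal are hypothetical. The transformed variable should satisfy $P(Z > z) = 1/z$ for $z \ge 1$.

```python
import math
import random

def pareto_standardize(x, survival):
    """Map x to Q(x) = 1 / (1 - F(x)), where survival(x) = 1 - F(x)."""
    return 1.0 / survival(x)

def exp_survival(x):
    """Survival function of the standard exponential law (illustrative marginal)."""
    return math.exp(-x)

rng = random.Random(1)
z = [pareto_standardize(rng.expovariate(1.0), exp_survival) for _ in range(100_000)]

# For a standard Pareto variable, P(Z > 4) = 1/4.
tail_at_4 = sum(1 for v in z if v > 4.0) / len(z)
print(tail_at_4)
```

In practice $F_X$ and $F_Y$ are unknown and would be replaced by estimates, which is a separate source of error not addressed in this sketch.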
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